Psweep: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources

Authors

  • Xin Zhang
  • Elke A. Rundensteiner
Abstract

Data warehouses (DW) are built by gathering information from several information sources (ISs) and integrating it into one repository customized to users' needs. Recent work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. SWEEP, proposed by Agrawal et al. [AAS97], is one of the more popular solutions, even though its performance is limited because the view maintenance module enforces a sequential ordering on the handling of data updates from ISs. We have overcome this limitation by developing a parallel algorithm for view maintenance, called PSWEEP, that retains all benefits of SWEEP while offering substantially improved performance. In order to perform parallel view maintenance, we solve two issues: detecting maintenance-concurrent data updates in a parallel mode, and correcting the problem that the DW commit order may not correspond to the DW update processing order due to parallel maintenance handling. By decomposing SWEEP into an architecture of modular components, we can insert a local timestamp assignment module for detecting maintenance-concurrent data updates without requiring any global clock synchronization. We introduce the negative counter concept as a simple yet sufficient solution to the Variant-DW-Commit problem, in which the effects of data updates are committed to the DW in varying orders. We have proven the correctness of PSWEEP, guaranteeing that our strategy indeed generates the correct final DW state. An evaluation of both SWEEP and PSWEEP shows that PSWEEP has the potential for multi-fold performance improvement over SWEEP, depending on the number of threads supportable in the given DW system implementation.
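The abstract does not spell out how the negative counter works, so the following is only a minimal sketch of one plausible reading: the DW keeps a signed count per view tuple and allows that count to drop below zero temporarily, so that a deletion committed before its matching insertion still leads to the same final state. The names `apply_delta` and `visible_rows`, the tuple layout, and the sample deltas are hypothetical and are not taken from the paper.

```python
from itertools import permutations

def apply_delta(view, delta):
    """Add signed tuple counts to the view; counts may go negative (assumption)."""
    for row, change in delta.items():
        new_count = view.get(row, 0) + change
        if new_count == 0:
            view.pop(row, None)   # a zero count means the tuple is absent
        else:
            view[row] = new_count
    return view

def visible_rows(view):
    """Rows a reader would see: only tuples with a positive count."""
    return {row: n for row, n in view.items() if n > 0}

# Hypothetical view deltas computed by three parallel maintenance threads.
deltas = [
    {("a", 1): +1},   # insertion of tuple ("a", 1)
    {("a", 1): -1},   # later deletion of the same tuple
    {("b", 2): +1},   # unrelated insertion
]

# Commit the deltas in every possible order and collect the final states.
final_states = set()
for order in permutations(deltas):
    view = {}
    for delta in order:
        apply_delta(view, delta)
    final_states.add(frozenset(view.items()))
```

Because signed addition is commutative, every commit order in this sketch converges to the same final view. Intermediate states can differ, which is why `visible_rows` hides tuples whose count is not positive.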

Similar resources

PVM: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources

Data warehouses (DW) are built by gathering information from distributed information sources (ISs) and integrating it into one customized repository. In recent years, work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. Popular solutions such as ECA and Strobe achieve such concurrent maintenance however with the requirement of quiescenc...

Detection and Correction of Conflicting Source Updates for Materialized View Maintenance

Materialized views, often derived from several data sources, must be maintained under source changes. In a distributed context, autonomous source updates can be concurrent and thus cause erroneous maintenance results. State-of-the-art maintenance strategies issue maintenance queries to the sources and apply compensating queries to correct such errors. However, these solutions are limited to han...

Data Warehouse Maintenance under Concurrent Schema and Data Updates

Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...

WPI-CS-TR-98-8, August 1998: Data Warehouse Maintenance Under Concurrent Schema

Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...

An Architecture of a Data

We present incremental view maintenance algorithms for a data warehouse derived from multiple distributed autonomous data sources. We begin with a detailed framework for analyzing view maintenance algorithms for multiple data sources with concurrent updates. Earlier approaches for view maintenance in the presence of concurrent updates typically require two types of messages: one to compute the ...

Publication date: 1999